Machine Learning Predicting nearly as well as the best pruning of a decision tree
Abstract
Many algorithms for inferring a decision tree from data involve a two-phase process. First, a very large decision tree is grown, which typically ends up overfitting the data. To reduce overfitting, in the second phase the tree is pruned using one of a number of available methods. The final tree is then output and used for classification on test data. In this paper, we suggest an alternative approach to the pruning phase. Using a given unpruned decision tree, we present a new method of making predictions on test data, and we prove that our algorithm's performance will not be much worse (in a precise technical sense) than the predictions made by the best reasonably small pruning of the given decision tree. Thus, our procedure is guaranteed to be competitive (in terms of the quality of its predictions) with any pruning algorithm. We prove that our procedure is very efficient and highly robust. Our method can be viewed as a synthesis of two previously studied techniques. First, we apply Cesa-Bianchi et al.'s results on predicting using expert advice (where we view each pruning as an expert) to obtain an algorithm that has provably low prediction loss but that is computationally infeasible. Next, we generalize and apply a method developed by Buntine and by Willems, Shtarkov and Tjalkens to derive a very efficient implementation of this procedure.
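To make the "each pruning is an expert" idea concrete, here is a minimal Python sketch of the computationally infeasible first step only. It is a generic exponentially weighted average forecaster, not the paper's exact algorithm. The complete binary tree, the fixed per-node predictions, the absolute loss, the learning rate `eta`, and all names (`prunings`, `leaf_reached`, `PruningForecaster`) are assumptions made for illustration.

```python
import math
from itertools import product


def prunings(node, depth):
    """Yield every pruning of the complete binary subtree rooted at `node`.

    Nodes are tuples of bits (the path from the root); a pruning is
    represented by the frozenset of its leaves.
    """
    yield frozenset([node])                  # cut here: node becomes a leaf
    if len(node) < depth:                    # or keep both children
        lefts = list(prunings(node + (0,), depth))
        rights = list(prunings(node + (1,), depth))
        for left, right in product(lefts, rights):
            yield left | right


def leaf_reached(leaves, x):
    """Return the unique leaf of the pruning that is a prefix of instance x."""
    for k in range(len(x) + 1):
        if x[:k] in leaves:
            return x[:k]
    raise ValueError("not a valid pruning for this instance")


class PruningForecaster:
    """Exponential weights over all prunings (one weight per pruning)."""

    def __init__(self, depth, node_pred, eta=2.0):
        self.experts = list(prunings((), depth))   # exponentially many
        self.weights = [1.0] * len(self.experts)   # uniform prior (assumed)
        self.node_pred = node_pred                 # node -> value in [0, 1]
        self.eta = eta

    def _expert_predictions(self, x):
        return [self.node_pred[leaf_reached(e, x)] for e in self.experts]

    def predict(self, x):
        preds = self._expert_predictions(x)
        total = sum(self.weights)
        return sum(w * p for w, p in zip(self.weights, preds)) / total

    def update(self, x, y):
        # Multiplicative update: shrink each pruning by its absolute loss.
        preds = self._expert_predictions(x)
        self.weights = [w * math.exp(-self.eta * abs(p - y))
                        for w, p in zip(self.weights, preds)]


# Toy example: depth-2 tree whose deepest leaves encode XOR of the two bits.
node_pred = {(): 0.5, (0,): 0.5, (1,): 0.5,
             (0, 0): 0.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 0.0}
forecaster = PruningForecaster(depth=2, node_pred=node_pred)
before = forecaster.predict((0, 1))
for _ in range(5):
    for x in product((0, 1), repeat=2):
        forecaster.update(x, x[0] ^ x[1])
after = forecaster.predict((0, 1))
print(len(forecaster.experts), before, after)
```

In this toy run the unpruned tree is the only pruning with zero loss, so its weight comes to dominate and the combined prediction moves toward it. Enumerating the prunings is exactly what becomes infeasible for larger trees, which is the gap the efficient implementation described in the abstract is meant to close.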
Similar resources
Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors
Abstract: With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, to detect threats and attacks, the importance of intrusion detection systems (IDS) as one of the key elements of security is increasing. One of the challenges of intrusion detection systems is managing the large number of network traffic features. Removing un...
Predicting Nearly as well as the best Pruning of a Planar Decision Graph
We design efficient on-line algorithms that predict nearly as well as the best pruning of a planar decision graph. We assume that the graph has no cycles. As in the previous work on decision trees, we implicitly maintain one weight for each of the prunings (exponentially many). The method works for a large class of algorithms that update their weights multiplicatively. It can also be used to desi...
Evaluating the Performance of the Decision Tree Model in Estimating River Suspended Sediment (Case Study: Ilam Dam Basin)
Accurate estimation of the volume of sediment carried by rivers is very important in water projects. In fact, finding the most suitable ways to calculate sediment discharge has been the objective of most research projects. Among these methods are machine learning methods such as the decision tree model (which is based on the principles of learning). Decision...
Exploring Gene Signatures in Different Molecular Subtypes of Gastric Cancer (MSS/TP53+, MSS/TP53-): A Network-based and Machine Learning Approach
Gastric cancer (GC) is one of the leading causes of cancer mortality worldwide. Molecular understanding of GC's different subtypes is still poor, and it is necessary to develop new subtype-specific diagnostic and therapeutic approaches. Therefore, comprehensive research in this area is needed to gain deeper insight into the molecular processes underlying these subtypes. In this st...
Comparative Analysis of Machine Learning Algorithms with Optimization Purposes
The fields of optimization and machine learning are increasingly intertwined, and optimization in different problems leads to the use of machine learning approaches. Machine learning algorithms work in reasonable computational time for specific classes of problems and play an important role in extracting knowledge from large amounts of data. In this paper, a methodology has been employed to opt...